feat(firecrawl): add parse operation and revert short-input selection style#4340

Merged
waleedlatif1 merged 4 commits into staging from waleedlatif1/firecrawl-parse
Apr 29, 2026

@waleedlatif1
Collaborator

Summary

  • add Firecrawl /v2/parse operation to the Firecrawl block with file upload support (basic) and file reference (advanced)
  • new internal API route at /api/tools/firecrawl/parse that handles multipart/form-data upload to Firecrawl
  • support all parse params: formats, onlyMainContent, includeTags, excludeTags, parsers, timeout, removeBase64Images, blockAds, proxy, zeroDataRetention
  • revert the selection:text-transparent change on short-input from #4318 (fix(short-input): hide selected text to prevent overlay collision), which did not work as intended
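
The parse parameters listed above could be modeled as a single options object. The following is a minimal TypeScript sketch, not the PR's actual `ParseParams` type: the field names come from the summary, while the value types and the `compactOptions` helper are assumptions made for illustration.

```typescript
// Field names are taken from the PR summary; the value types are assumptions.
interface ParseOptions {
  formats?: string[];
  onlyMainContent?: boolean;
  includeTags?: string[];
  excludeTags?: string[];
  parsers?: string[];
  timeout?: number;
  removeBase64Images?: boolean;
  blockAds?: boolean;
  proxy?: string;
  zeroDataRetention?: boolean;
}

// Hypothetical helper: drop unset values so only explicit choices are sent.
function compactOptions(options: ParseOptions): Record<string, unknown> {
  const compacted: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(options)) {
    if (value === undefined || value === "") continue; // field left blank
    if (Array.isArray(value) && value.length === 0) continue; // empty list
    compacted[key] = value;
  }
  return compacted;
}
```

With this sketch, `compactOptions({ formats: ["markdown"], includeTags: [], proxy: "" })` keeps only `formats`, so blank UI fields would not override the API's defaults.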

Type of Change

  • New feature

Testing

Tested manually

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel

vercel Bot commented Apr 29, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| docs | Skipped | Skipped | Apr 29, 2026 7:03pm |

@cursor

cursor Bot commented Apr 29, 2026

PR Summary

Medium Risk
Adds a new server route that downloads user files and proxies them to an external API, so regressions could impact uploads, auth enforcement, and error handling. Changes are otherwise additive and localized to the Firecrawl integration and docs.

Overview
Adds a new Firecrawl document parsing capability (firecrawl_parse) that accepts uploaded files or file references, forwards them via a new internal /api/tools/firecrawl/parse route to Firecrawl’s /v2/parse, and returns parsed markdown plus optional summary/HTML/metadata.

Updates the Firecrawl block/UI and tool registry to expose the new operation and its parse-specific options (formats, tag filters, parsers, timeout, ads/proxy/zero-retention), and updates docs/integration metadata accordingly. Also reverts the short-input selection styling tweak by removing selection:text-transparent.
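
As a rough illustration of the response handling described in the overview, the sketch below maps a route payload of the form `{ success, output }` to the tool's output fields (`markdown`, `summary`, `html`, `links`, `metadata`, `warning`). The function name `transformParseResponse`, the helper functions, and the defaulting behavior are assumptions, not the code merged in this PR.

```typescript
// Output fields named in this PR; which ones are optional is an assumption.
interface ParseOutput {
  markdown: string;
  summary?: string;
  html?: string;
  links?: string[];
  metadata?: Record<string, unknown>;
  warning?: string;
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function asString(value: unknown): string | undefined {
  return typeof value === "string" ? value : undefined;
}

// Hypothetical transform: copy over only the fields that are present and well-typed.
function transformParseResponse(payload: unknown): ParseOutput {
  const inner = isRecord(payload) ? payload.output : undefined;
  const output: Record<string, unknown> = isRecord(inner) ? inner : {};

  // markdown is always returned; fall back to an empty string if it is missing.
  const result: ParseOutput = { markdown: asString(output.markdown) ?? "" };

  const summary = asString(output.summary);
  if (summary !== undefined) result.summary = summary;

  const html = asString(output.html);
  if (html !== undefined) result.html = html;

  const warning = asString(output.warning);
  if (warning !== undefined) result.warning = warning;

  const links = output.links;
  if (Array.isArray(links)) {
    result.links = links.filter((link): link is string => typeof link === "string");
  }

  const metadata = output.metadata;
  if (isRecord(metadata)) result.metadata = metadata;

  return result;
}
```

Leaving absent fields off the result, rather than setting them to empty values, keeps downstream blocks from treating an unrequested format as an empty one.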

Reviewed by Cursor Bugbot for commit 29580b3. Configure here.

@gitguardian

gitguardian Bot commented Apr 29, 2026

️✅ There are no secrets present in this pull request anymore.

If these secrets were true positive and are still valid, we highly recommend you to revoke them.
While these secrets were previously flagged, we no longer have a reference to the
specific commits where they were detected. Once a secret has been leaked into a git
repository, you should consider it compromised, even if it was deleted immediately.
Find here more information about risks.


🦉 GitGuardian detects secrets in your source code to help developers and security teams secure the modern development process. You are seeing this because you or someone else with access to this repository has authorized GitGuardian to scan your pull request.

@greptile-apps
Contributor

greptile-apps Bot commented Apr 29, 2026

Greptile Summary

This PR adds a firecrawl_parse operation to the Firecrawl block that accepts uploaded documents (PDF, DOCX, HTML, XLSX, etc.) and returns clean markdown via Firecrawl's /v2/parse API. It introduces a new internal proxy route at /api/tools/firecrawl/parse, a full tool definition in tools/firecrawl/parse.ts, supporting types, block UI sub-blocks (basic file-upload + advanced file-reference), and documentation — plus a one-line revert of a non-functional selection style on short-input.

Confidence Score: 5/5

Safe to merge — all findings are P2 style suggestions with no blocking defects.

The new parse operation follows all established patterns for file-handling, auth, error forwarding, tool registration, and response transformation. No P0 or P1 issues found. Two P2 items: an accidental one-line removal in notion.mdx and a proxy dropdown missing a none entry.

apps/docs/content/docs/en/tools/notion.mdx — verify the description removal is intentional.

Important Files Changed

| Filename | Overview |
| --- | --- |
| apps/sim/app/api/tools/firecrawl/parse/route.ts | New internal API route that downloads a stored file, builds a multipart/form-data request, and forwards it to Firecrawl v2/parse; auth, error forwarding, and Zod validation are all in place. |
| apps/sim/tools/firecrawl/parse.ts | New tool definition for firecrawl_parse; request body, transformResponse, and output schema are consistent with the route's response shape. |
| apps/sim/blocks/blocks/firecrawl.ts | Adds Parse Document operation including file-upload/file-reference sub-blocks and all advanced params; proxy dropdown lacks a "none" option. |
| apps/sim/tools/firecrawl/types.ts | Adds ParseParams and ParseResponse interfaces; types are consistent with the API surface. |
| apps/docs/content/docs/en/tools/notion.mdx | Unintentionally removes the description line for notion_add_database_row; appears to be an accidental edit. |
| apps/sim/app/workspace/[workspaceId]/w/[workflowId]/components/panel/components/editor/components/sub-block/components/short-input/short-input.tsx | Reverts selection:text-transparent from #4318 as described in the PR; straightforward one-line change. |
| apps/sim/tools/registry.ts | Registers firecrawlParseTool under the firecrawl_parse key; consistent with other Firecrawl tool registrations. |

Sequence Diagram

sequenceDiagram
    participant User as User / Workflow
    participant Block as FirecrawlBlock (parse case)
    participant Tool as firecrawl_parse tool
    participant Route as /api/tools/firecrawl/parse
    participant Storage as File Storage
    participant FC as Firecrawl API (/v2/parse)

    User->>Block: params.document (file-upload or file-reference)
    Block->>Block: normalizeFileInput(params.document)
    Block->>Tool: { file, formats, onlyMainContent, options... }
    Tool->>Route: POST JSON { apiKey, file, options }
    Route->>Route: checkInternalAuth
    Route->>Route: FirecrawlParseSchema.parse(body)
    Route->>Route: processFilesToUserFiles(file)
    Route->>Storage: downloadFileFromStorage(userFile)
    Storage-->>Route: buffer
    Route->>Route: new Blob(buffer) + FormData
    Route->>FC: POST multipart/form-data { file, options }
    FC-->>Route: { success, data: { markdown, ... } }
    Route-->>Tool: { success: true, output: data }
    Tool->>Tool: transformResponse → extract markdown, metadata, etc.
    Tool-->>User: { markdown, summary, html, links, metadata, warning }
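
The `new Blob(buffer) + FormData` step in the diagram can be sketched roughly as follows. This is an illustration under stated assumptions rather than the route's actual code: the `file` and `options` field names come from the diagram, but the helper name `buildParseForm`, its signature, and the choice to JSON-encode `options` are hypothetical. It relies on the `Blob` and `FormData` globals available in browsers and in Node.js 18+.

```typescript
// Hypothetical helper: wrap downloaded file content and parse options in a multipart form.
function buildParseForm(
  content: string | ArrayBuffer,
  fileName: string,
  mimeType: string,
  options: Record<string, unknown>,
): FormData {
  const form = new FormData();

  // The Blob carries the MIME type; the third argument to append() sets the filename.
  const blob = new Blob([content], { type: mimeType });
  form.append("file", blob, fileName);

  // Assumption: options travel as one JSON-encoded form field.
  form.append("options", JSON.stringify(options));

  return form;
}
```

The resulting form would then be passed as the `body` of a `fetch` call. When the body is a `FormData` instance, `fetch` sets the multipart boundary header itself, so the `Content-Type` header should not be set by hand.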

Reviews (2): Last reviewed commit: "fix(firecrawl): forward firecrawl error ..." | Re-trigger Greptile

Comment thread apps/sim/app/api/tools/firecrawl/parse/route.ts
Comment thread apps/sim/app/api/tools/firecrawl/parse/route.ts
@waleedlatif1 waleedlatif1 force-pushed the waleedlatif1/firecrawl-parse branch from b0990fe to 5bb89f4 Compare April 29, 2026 18:50
@waleedlatif1
Collaborator Author

@greptile

@waleedlatif1
Collaborator Author

@cursor review

@cursor cursor Bot left a comment

✅ Bugbot reviewed your changes and found no new issues!

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 29580b3. Configure here.

@waleedlatif1 waleedlatif1 merged commit 7d8ec24 into staging Apr 29, 2026
9 checks passed
@waleedlatif1 waleedlatif1 deleted the waleedlatif1/firecrawl-parse branch April 29, 2026 19:04
waleedlatif1 added a commit that referenced this pull request Apr 30, 2026
… style (#4340)

* feat(firecrawl): add parse operation and revert short-input selection style

* chore(firecrawl): regenerate docs and integrations data for parse

* fix(firecrawl): forward firecrawl error body in parse route response

* fix(firecrawl): add pricing config to parse tool hosting
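
For readers unfamiliar with the error-forwarding fix named in the commit list, the sketch below shows one way a proxy route could surface the upstream error body instead of a generic message. It is an assumption-laden illustration: `toRouteResult`, the `UpstreamResult` shape, and the status-code choices are hypothetical and are not taken from the merged route.

```typescript
// Hypothetical view of the upstream HTTP response after reading its body as text.
interface UpstreamResult {
  ok: boolean;
  status: number;
  bodyText: string;
}

interface RouteResult {
  status: number;
  body: { success: boolean; output?: unknown; error?: string };
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null && !Array.isArray(value);
}

function toRouteResult(upstream: UpstreamResult): RouteResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(upstream.bodyText);
  } catch {
    parsed = undefined; // upstream body was not JSON
  }

  // Success path: unwrap { success, data } into { success, output }.
  if (upstream.ok && isRecord(parsed) && parsed.success === true) {
    return { status: 200, body: { success: true, output: parsed.data } };
  }

  // Error path: prefer the upstream "error" string, then the raw body, then a fallback.
  const upstreamError = isRecord(parsed) ? parsed.error : undefined;
  const message =
    typeof upstreamError === "string"
      ? upstreamError
      : upstream.bodyText || `Upstream request failed with status ${upstream.status}`;

  // Assumption: keep the upstream status on failure; use 502 for a malformed 2xx body.
  return {
    status: upstream.ok ? 502 : upstream.status,
    body: { success: false, error: message },
  };
}
```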